Abstract

PAS-II, a computer program which represents a generalized version of
an automatic protocol system (PAS-I), is described. PAS-II is a
task-free, interactive, modular data analysis system for inferring
the information processes used by a human from his verbal behavior
while solving a problem. The output of the program is a Problem
Behavior Graph: a description of the subject's changing knowledge
state during problem solving. As an example of system operation the
PAS-II analysis of a short cryptarithmetic protocol is presented.

1.  Introduction

Automatic protocol analysis is a joint effort by man and machine to
infer, from the record of the time course of a subject's behavior,
the underlying information processes. As developed (5), it usually
refers to the verbalizations of a subject solving some problem under
instructions to think out loud. Protocol analysis designates the full
range of activities engaged in by the psychologist when working with
protocols: description of the subject's behavior according to an
hypothesized model, induction of new rules, derivation of
consequences from a model in the context of specific data, and
measurement of adequacy of a model. The initial focus of our work has
been behavior description in terms of information processes, given an
hypothesized general model (the so-called problem space in which the
subject operates).

The PAS-I system (14, 15) was our first attempt at automatic protocol
analysis. This is a fully automatic, non-interactive, specialized
system designed to analyze cryptarithmetic protocols and produce as
output a problem behavior graph (PBG) describing the subject's search
through a posited problem space. The protocol analysis is represented
as a sequence of processing stages that eventually transform the raw
protocol into a problem behavior graph. At each stage rules are
applied which effect a transformation of the data. The organization
of PAS-I is shown in Figure 1.

PAS-I has successfully analyzed protocols from DONALD+GERALD=ROBERT
and CROSS+ROADS=DANGER cryptarithmetic problems. The results obtained
in the DONALD+GERALD=ROBERT task for two of the subjects have been
discussed in detail (15) and demonstrate that this approach to
automatic protocol analysis is both feasible and rewarding.

Encouraged by the success of PAS-I we have designed and built an
improved version called PAS-II. PAS-II was designed with two major
goals in mind: to make it interactive and task free. By interactive
we mean that the user is permitted to take an active part in the
analysis: he can provide answers to subproblems the system is unable
to solve, correct processing errors, and even maintain control over
the processing sequence. Clearly, real-time interaction of this sort
makes the system a more powerful tool for protocol analysis. By task
free we mean that the system is independent of any particular problem
domain. To make PAS-II task free we partitioned the system into two
parts: the problem dependent part, consisting of the processing rules
or heuristics used at each stage of the analysis, and the problem
independent part, consisting of the general control structure and
command language. Thus, to apply the system to a protocol in a new
problem area the user must first supply the system with processing
rules for that domain. The design of PAS-II also included four
subgoals: to make the system transparent, modifiable, extendable, and
open (see Figure 2).

Two important implementation issues were not addressed in the design
of PAS-II:

1) Improve system performance in cryptarithmetic. This includes
expanding the deductive and inductive inference capabilities, and
"fine tuning" the system by optimizing the processing heuristics to
produce the best possible analysis within the given framework.

2) Extend the scope of the analysis. For example, extend the system
back to handle the speech recognition and segmentation problems
inherent in producing a transcription from the audio tape. Or extend
the system to handle the problem of inducing the problem space from
the protocol or inducing a production system model from the problem
behavior graph.

It was decided to make PAS-II interactive and task free, postponing
the problems of increasing power in a particular task or broadening
the scope of the analysis. This decision was influenced by the desire
to provide a working tool for protocol analysis that could be used by
participants at a workshop on New Techniques in Cognitive Research
held at CMU in the summer of 1972 (7). The PAS-II system is currently
running in LISP at CMU on a PDP-10 and is available to the CMU (and
the ARPA Network) community.

This paper is organized as follows. The task of protocol analysis is
discussed in Section 2. This is followed in Section 3 by a brief
description of the structure of the program and in Section 4 by an
example of its use in analyzing a cryptarithmetic protocol. Section 5
concludes with a discussion of the general executive structure of the
system and its implication for AI data analysis programs.

2.  Task of Protocol Analysis

Protocol analysis is a complex data processing task requiring both
deductive and inductive inference capabilities. Our current approach
to protocol analysis is based on a particular theory of human problem
solving. For a description of this theory and an introduction to the
task of protocol analysis see Newell and Simon (5).


Ultimately,  a  library  containing  processing  rules 
for  a  number  of  different  problem  domains  will  be 
available  to  the  user. 
Figure 1.  Flow Diagram of PAS-I.


GOALS

Interactive:  User and system exchange information during processing.

Task Free:    System is independent of any particular problem domain.

SUBGOALS

Transparent:  System is easy to use and understand by virtue of a
              clean organization and the ability to explain itself.

Modifiable:   Basic changes in the data processing procedure can be
              made by a user with no knowledge of the language used
              to program the system.

Extendable:   The programmer can easily enlarge the system to
              encompass a wider range of the data analysis.

Open:         The user, rather than the program, initiates and
              controls the interaction and accordingly gains ultimate
              control of the processing sequence.

Figure 2.  Design Considerations for PAS-II.


Theoretical Substructure


group  is  defined  to  be  an  operator  together  with  its 
input  and  output  knowledge  elements. 


Problem Space. We assume human problem solving takes place by search
in a problem space. The elements of this space are the possible
states of knowledge the subject can have about the task, where a
state of knowledge is simply an expression of what the subject knows
at some particular point in the space. Besides knowledge states, the
problem space also includes a set of operators. These define
operations the subject can perform on knowledge at a particular state
to yield new knowledge, and hence to move to a new knowledge state.
The operators are incremental, that is, they take as input a small
portion of the total knowledge state (a small set of knowledge
elements) and produce as output new knowledge elements.

Problem Behavior Graph. The subject's search through the problem
space for a solution can be described as a sequence of operator
applications that create a string of incrementally changing knowledge
states. The plot of this search is called the problem behavior graph
(PBG). Figure 8 (also used to illustrate the output of the analysis
given in Section 4) shows a problem behavior graph for
cryptarithmetic. The nodes represent operator applications: the
knowledge elements at the lower left of each node are the inputs,
those at the lower right are the outputs. PBG branching results from
the subject abandoning information and returning to a prior knowledge
state (usually because of a discovered contradiction). For example,
in Figure 8 the outputs of nodes 4 and 6 conflict: "R is 4" conflicts
with "R is odd," and leads to the abandonment of nodes 4, 5 and 6.
Note that the knowledge state at any point in the graph is the
conjunction of all output elements on the path from the given point
back to the beginning of the graph.

All  nodes  on  the  path  from  the  last  node  back  to  the 
beginning  of  the  graph  are  called  currently  active 
nodes.  Their  output  elements  define  the  current 
knowledge  state. 
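
The definition of the current knowledge state can be illustrated with
a minimal sketch in Python (not the LISP of PAS-II itself). The Node
class, the operator names OP-A, OP-B and OP-C, and the particular
elements are hypothetical and chosen only for illustration; they are
not taken from Figure 8.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One operator application in a problem behavior graph (hypothetical layout)."""
    operator: str
    inputs: list
    outputs: list
    parent: Optional["Node"] = None


def active_nodes(last):
    """Return the nodes on the path from the beginning of the graph to `last`."""
    path = []
    node = last
    while node is not None:
        path.append(node)
        node = node.parent
    return list(reversed(path))


def knowledge_state(last):
    """Collect all output elements along the currently active path."""
    state = []
    for node in active_nodes(last):
        state.extend(node.outputs)
    return state


# OP-B is abandoned after a contradiction; the subject returns to n1
# and the graph branches, so n3 hangs off n1 rather than n2.
n1 = Node("OP-A", [("EQ", "D", 5)], [("EQ", "T", 0)])
n2 = Node("OP-B", [], [("EQ", "R", 4)], parent=n1)
n3 = Node("OP-C", [], [("ODD", "R")], parent=n1)

state = knowledge_state(n3)
```

Because n2 is not on the path from n3 back to the beginning of the
graph, its output "R is 4" does not appear in the resulting state.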

Data  Analysis 

The data being analyzed is the transcribed text of a subject's verbal
protocol. As the text is transformed into a PBG it is subjected to
four major types of processing: linguistic, semantic, group, and PBG.
Figure 1 typifies such a processing sequence.

Linguistic Processing. The text is first segmented into shorter
strings called topic segments, each of which is expected to
ultimately yield approximately one problem space element. Each
segment is then parsed using a grammar sensitive to the problem
domain under consideration. The result of parsing is a set of
semantic elements which represent the meaning of the segment. For
example, the segment "D is not equal to 6" might yield the elements
(NEG) (EQ D 6) in the cryptarithmetic task. Here (NEG) is called an
indicator element, (EQ D 6) a knowledge element.

Semantic Processing. The semantic elements produced through parsing
are first combined in very elementary ways to produce new elements,
e.g., (NEG) and (EQ D 6) become (NEQ D 6). Next, new elements
reflecting relationships between elements from adjacent segments are
produced. Thus, (EQ D 5) from one segment and (THEREFORE) (EQ T 0)
from the next segment become (BECAUSEOF (EQ D 5) (EQ T 0)), i.e.,
"because D is 5, T is 0." Finally, these elements are arranged into
initial approximations of operator groups, each containing an
operator element and the surrounding knowledge and indicator
elements. An operator


Group Processing. The tentative operator groups produced during
semantic processing are now analyzed to obtain a complete picture of
what the subject knows at each moment and what operators he applies.
First, variables in semantic elements are identified by comparing the
elements to the current context as defined by the PBG. Thus if
(EQ D 5) were in the PBG then when given the element (EQ <L> 5),
where <L> stands for a class of letters, we recognize that <L> in
this case is the letter D.

The second part of group processing consists of finding, or
hypothesizing, the origin of every knowledge element in each
tentative group. The origin of a knowledge element is defined to be
the operator which produced it, plus the inputs to that operator,
plus the operators which produced those inputs, etc. Thus the origin
can be represented as a tree which defines a collection of
overlapping operator groups.

PBG Processing. The operator groups produced during group processing
are now incorporated into the PBG. In general, each group becomes a
node in the PBG. In the simplest case the new node is just attached
to the last currently active node. However, when contradictions occur
(the output of one node conflicts with the output of another)
restructuring occurs to eliminate the conflict (see Figure 8).

3.  Structure of the Program

PAS-II takes as input a transcribed text of the verbalization of a
subject solving a problem and produces as output a PBG. The
processing rules for the various stages, including the rules defining
the problem space, are given to the system. These rules are supplied
either by the system builder via a library of rules for various
problem domains or by the user himself.

Modular Structure

PAS-II is organized as a modular data analysis system. The basic unit
of organization is the mode, a processing state which has associated
with it a buffer capable of holding rules or data. This buffer can be
modified by the editing functions available in the command language.
There are three types of modes: run modes, which hold the data being
analyzed, rule modes, which hold the processing rules, and auxiliary
modes, which hold task-free system-oriented rules. Thus the
information in the rule modes constitutes the problem dependent part
of the system.

The next level of organization is the stage: a unit consisting of one
run mode and any number of associated rule modes. Data processing is
performed in a stage by applying the rules from the rule modes
associated with that stage to the data present in the run mode of the
previous stage. The result of the processing is then put into the run
mode of the current stage. Figure 3 illustrates the modular
organization of PAS-II, with the arrows indicating data flow and the
lines indicating mode associations.
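
The stage organization can be pictured with a small Python sketch. It
assumes, purely for illustration, that each rule is a function from a
list of buffer lines to a new list of lines; the two rules shown are
hypothetical stand-ins and are not PAS-II rules.

```python
def run_stage(rule_modes, previous_run_buffer):
    """Apply the rules of a stage's rule modes to the previous run mode's data.

    The returned list is what would be placed in the run mode of the
    current stage.
    """
    data = list(previous_run_buffer)
    for rules in rule_modes:          # one list of rules per rule mode
        for rule in rules:
            data = rule(data)
    return data


# Hypothetical rules, standing in for the contents of a rule mode.
def split_sentences(lines):
    return [part.strip() for line in lines
            for part in line.split(".") if part.strip()]


def upper_case(lines):
    return [line.upper() for line in lines]


buffers = {"TEXT": ["d is 5. t is 0."], "TOPIC": []}
buffers["TOPIC"] = run_stage([[split_sentences, upper_case]], buffers["TEXT"])
```

Keeping the rules outside run_stage mirrors the division described
above: the driver is task free, while the rule modes carry everything
that depends on the problem domain.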

The highest level of organization is the processor: a unit consisting
of consecutive stages in the control cycle. For example, in PAS-II
two linguistic stages form the Linguistic processor and three
semantic stages form the Semantic processor.


Modes. The modes currently implemented in PAS-II are listed in
Table 1. Note that most run modes have one or two rule modes
associated with them. This association is illustrated in Table 1 and
also in Figure 3, which shows the modular composition of the various
processors in PAS-II. The arrows in the figure define the data links
existing between modes. The mode at the tail of an arrow provides the
data that the mode at the head of the arrow processes. For example,
processing in the TOPIC mode involves applying the SEGMENTATION rules
to the data in the TEXT mode and then placing the result in the TOPIC
mode. As each line in TEXT is processed, it is deleted from the TEXT
buffer. However, a copy of these deleted lines is stored elsewhere in
TEXT and can be retrieved (see the process functions in Table 2). The
arrows in Figure 3 do not necessarily define the control cycle, i.e.,
the order in which processing occurs. The control flow is illustrated
in Figure 4 (to be discussed later).


MODES

RUN            RULE               AUXILIARY

TEXT                              ASSOCIATION
TOPIC          SEGMENTATION       SAVE
LINGUISTIC1    EXTRACTION         CONTROL
LINGUISTIC2    SPACE, GRAMMAR     INFORMATION
SEMANTIC1      INTEGRATION
SEMANTIC2      NORMALIZATION
SEMANTIC3      GROUPING
GRAPHIC1       UNKNOWNS
GRAPHIC2       ORIGIN
GRAPHIC3       CONFLICT, PBG
TRACE1
TRACE2         PS, MEMORY
TRACE3
TRACE4         MATCH

Table 1.  PAS-II Modes.


Functions. The functions currently implemented in PAS-II are listed
in Table 2. They constitute the command language available to the
user, and are divided into four categories: basic, edit, flag, and
process functions. Note that a mode name is a function that puts the
user into that mode.

A function call consists of a function name followed by its
arguments. Any number of function calls may occur together. If it is
not clear which names are the functions and which are the arguments,
parentheses can be used for disambiguation. In ambiguous cases the
system always assumes the name is a function name rather than an
argument. Thus if the user types HELP TOPIC DISPLAY 3 it could mean
either (HELP TOPIC): give me information about the TOPIC mode, and
(DISPLAY 3): display line 3 of the current buffer; or (HELP): tell me
how to get help, (TOPIC): put me into the TOPIC mode, and
(DISPLAY 3): display line 3. The system would make the latter
interpretation.
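
The disambiguation convention can be sketched in Python as follows.
This is an illustrative reading of the rule stated above, not the
PAS-II parser: the FUNCTIONS set is a small hypothetical subset of the
command language, and nested parentheses are not handled.

```python
# Hypothetical subset of the command language, for illustration only.
FUNCTIONS = {"HELP", "TOPIC", "TEXT", "DISPLAY", "GO"}


def parse_commands(line):
    """Split a typed line into function calls, each a list [name, arg, ...].

    A parenthesized group is taken as one explicit call. Outside
    parentheses, any token that is a known function name starts a new
    call; every other token is an argument of the call in progress.
    """
    tokens = line.replace("(", " ( ").replace(")", " ) ").split()
    calls = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token == "(":
            end = tokens.index(")", i)       # raises ValueError if unbalanced
            calls.append(tokens[i + 1:end])
            i = end + 1
            continue
        if token in FUNCTIONS:
            calls.append([token])
        elif calls:
            calls[-1].append(token)
        else:
            raise ValueError("argument with no function: " + token)
        i += 1
    return calls


ambiguous = parse_commands("HELP TOPIC DISPLAY 3")
explicit = parse_commands("(HELP TOPIC) (DISPLAY 3)")
```

Here ambiguous comes out as three calls, (HELP), (TOPIC) and
(DISPLAY 3), while the parenthesized form keeps TOPIC as the argument
of HELP.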


Comparison with Figure 1 shows how PAS-II maps onto PAS-I. Note that
the scope of the analysis has been extended to include a Trace
processor (not discussed in detail in this paper).


Auxiliary Modes. There are four auxiliary modes: save, control,
association, and information. The SAVE mode contains rules which
specify which mode buffers are to be saved on (or read into from) a
disk file when the WRITE (or READ) command is executed. The CONTROL
mode contains rules which define the control cycle for the system.
Initially these rules define the control flow shown in Figures 3
and 4. The ASSOCIATION mode contains rules which define the
associations between run and rule modes. The initial (or default)
associations are those shown in Figure 3. The CONTROL and ASSOCIATION
modes, together with the CREATE function, permit the sophisticated
user to create new modes, redefine mode associations, and reorganize
the control flow for the entire system. One example of this is the
use of a reorganized PAS-II to analyze a problem description (problem
text) in natural language in order to infer from that text a
tentative problem space, one that a subject might use in representing
the problem (2).


The INFORMATION mode is unique in containing no buffer and
recognizing none of the functions that constitute the command
language. Instead, this mode responds to key words in the user's
input, which may be in sentence form. The mode provides the user with
general information about PAS-II: its basic organization, purpose,
and techniques of operation. This is to be contrasted with the HELP
function, which provides the user with specific, on-the-spot
information about the mode he is in.

Control  Structure 

The control cycle for PAS-II is shown in the flow diagram of
Figure 4. The solid arrows indicate the stage that is entered once
processing in the current stage is finished. The broken arrows
indicate which stage to enter before processing is started.
Processing in LINGUISTIC1, SEMANTIC3, and GRAPHIC2 is incremental. In
each of these modes only part of the data from the previous mode is
processed at one time. This initial portion of the data is then
carried through the rest of the system, leading to the growth of PBG
nodes, before the rest of the data in the previous mode is processed.
This is done to establish a semantic context (the PBG) as early as
possible in the processing sequence so it can provide feedback needed
for linguistic, semantic, and group processing.*

Since  the  control  organization  of  PAS-II  is 
quite  flexible,  the  user  is  under  no  constraints  to 
process  the  data  in  the  order  shown  in  Figure  4.  He 
may  skip  or  repeat  stages  within  the  existing  control 
framework,  and  may  redefine  the  control  cycle  (via 
the  CONTROL  mode) .  He  may  also  have  the  system  put 
him  into  the  next  run  mode  in  the  control  loop,  or 
even  automatically  step  him  through  the  run  modes, 
initiating  the  processing  at  each  stage  (see  NEXT 
and  AUTOMATIC  in  Table  2). 

Data  Processing 

Figures 3 and 4 show the processors which comprise the control cycle
of PAS-II. In the Topic processor transcribed text is segmented into
phrases containing only a single task topic.** Then in the
Linguistic processor an initial collection of these

*  At present the PBG provides feedback for group processing only.

** This is a slight extension: PAS-I requires segmented text as
   input.


NAME          DESCRIPTION

BASIC
(mode name)   Puts user into the mode named.
CREATE        Creates a new mode.
DISPLAY       Displays the contents of M.
ERASE         Uncreates M (if it was formed using CREATE).
EXIT          Takes the user out of the system (to LISP).
HELP          Provides system information pertinent to M.
MODE          Tells the user what mode he is in.
NEXT          Puts the user into the next appropriate run mode of C.
RULE          Puts the user into the rule mode associated with M.
RUN           Puts the user into the run mode associated with M.

EDIT
BREAK         Breaks a line in M into two or more smaller lines.
CONNECT       Connects adjacent lines in M to form a single line.
DEFINE        Permits the user to define the contents of lines in M.
DELETE        Deletes lines in M.
ED            Enables the user to perform intra-line editing in M.
INSERT        Inserts a line after a given line in M.
READ          Reads data from a disk file into M.
RENUMBER      Renumbers the lines in M.
WRITE         Writes the contents of M onto a disk file.

FLAG
AUTOMATIC     Steps the user through C, executing GO in each run mode.
BATCH         Stops system queries during run mode processing.
COMMENT       Permits comments to be displayed when a line is displayed.
FAST          Speeds up reading from the disk by eliminating format checking.
HUSH          Abbreviates error messages.
NUMBERS       Causes disk files to be written with buffer line numbers.
PRINT         Puts all the I/O at the terminal onto a disk file.
SEARCH        Causes processing to be repeated until no rules are applicable.
SUPPRESS      Suppresses printing of auxiliary information during processing.
TIME          Causes processing time in M to be printed.
VERSION1      Causes the old version of grammar parser to be used.
VERSION2      Causes the new improved version of grammar parser to be used.

PROCESS
AGAIN         Puts the data in M into P and fires GO.
COPY          Prints the copy of the data in M.
GO            Processes the data located in P and puts the result into M.
RECOPY        Puts the copy of the data from M back into M.
RESTART       Puts the copy of the data from P back into P and fires START.
START         Deletes the data in M and fires GO.

KEY
M             mode buffer of the mode the user is in
P             mode buffer prior to M in C
C             control cycle

Table 2.  Description of PAS-II Functions
(Flag descriptions are for the condition flag = T)

Figure 4.  Flow diagram of PAS-II.

Key:  solid arrow:   stage to enter after processing
      broken arrow:  stage to enter before processing


segments is parsed yielding sets of semantic elements. These elements
are processed and refined in the Semantic processor to produce groups
composed of one operator element and its associated input and output
knowledge elements. In the PBG processor these groups are
incorporated into the PBG. The Trace processor is then used to
compare this PBG with the trace produced by a given production system
model of the subject.

Topic Processor. The Topic processor contains two run modes: TEXT and
TOPIC. TEXT is an initialization mode; it holds the data for TOPIC to
process; thus no real processing takes place in it. The TOPIC mode
uses the SEGMENTATION rules to segment all the text in the TEXT mode.
These rules have the general form: string1 / string2, where a string
is any sequence of words, punctuation marks, or word classes (as
defined in the GRAMMAR mode), including the null sequence. The slash
(/) indicates where the text is to be broken, i.e., after every
occurrence of string1 that is immediately followed by an occurrence
of string2. Figure 6 shows SEGMENTATION rules for cryptarithmetic (to
be used in the example in Section 4).
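
The effect of rules of the form string1 / string2 can be sketched in
Python. The sketch assumes the text has already been split into word
and punctuation tokens, leaves out word classes, and uses two
hypothetical rules that are not the SEGMENTATION rules of Figure 6.

```python
def segment(words, rules):
    """Break `words` after each string1 that is immediately followed by string2.

    Each rule is a pair (string1, string2) of token lists; either list
    may be empty, standing for the null sequence.
    """
    breaks = set()
    for before, after in rules:
        for i in range(len(words) + 1):      # i is a candidate break position
            start = i - len(before)
            if start < 0:
                continue
            if words[start:i] == before and words[i:i + len(after)] == after:
                breaks.add(i)

    segments, current = [], []
    for i, word in enumerate(words):
        if i in breaks and current:
            segments.append(current)
            current = []
        current.append(word)
    if current:
        segments.append(current)
    return segments


# Hypothetical rules: break after a comma, and break before the word "so".
rules = [([","], []), ([], ["so"])]
words = "d is 5 so t is 0 , because 5 plus 5 is 10".split()
topic_segments = segment(words, rules)
```

With these two rules the sample text falls into three topic segments:
"d is 5", "so t is 0 ," and "because 5 plus 5 is 10".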


Linguistic Processor. The Linguistic processor contains two run
modes: LINGUISTIC1 and LINGUISTIC2. In LINGUISTIC1 the EXTRACTION
rules are used to select a consecutive set of segments from TOPIC,
representing an initial guess as to the minimum number of segments
from which a group can be inferred. Processing consists only of
transferring these segments from the TOPIC mode to the LINGUISTIC1
mode. At present, the EXTRACTION rules are simply a single integer
specifying how many segments to transfer.


Processing in the LINGUISTIC2 mode consists of applying the SPACE and
GRAMMAR rules to all the topic segments in LINGUISTIC1. The parsing
operation produces, for each segment, a set of semantic elements
representing the meaning of the segment. The rules in the SPACE mode
define the problem space and have the form: (semantic-element) type,
where a semantic element is either an operator, knowledge, or
indicator element and the type is either OP, KN, or IND. The GRAMMAR
rules define a key-word grammar and have the form:
<class> = (item11 item12 ...) (item21 item22 ...) ..., where an item
is either a class (denoted by angle brackets) or a literal (such as a
word, letter, or character). An asterisk (*) can be used between any
two items to indicate a match with any string of text, and any
GRAMMAR rule which is a disjunction of single literals can be written
without parentheses. Figure 6 shows SPACE and GRAMMAR rules for
cryptarithmetic.


Semantic Processor. The Semantic processor contains three run modes:
SEMANTIC1, SEMANTIC2, and SEMANTIC3. In SEMANTIC1 the INTEGRATION
rules produce new elements by combining semantic elements generated
from the same or adjacent segments. In SEMANTIC2 the NORMALIZATION
rules map knowledge and indicator elements into single elements
reflecting the relationships existing between two or more knowledge
elements. In SEMANTIC3 a tentative operator group (protogroup) is
formed. The INTEGRATION and NORMALIZATION rules are replacement rules
of the type A => B, i.e., replace A with B. Both A and B can be lists
of semantic elements. A slash (/) indicates that the next elements of
the list occur on the next line of the mode buffer. Class names and
X's are used as variables, and in the NORMALIZATION rules A's are
variables which stand for knowledge elements on adjacent lines
connected by the AND indicator. Typical INTEGRATION and NORMALIZATION
rules for cryptarithmetic are shown in Figure 6. GROUPING rules are
not shown. They define a protogroup to be the largest consecutive
sequence of elements containing no more than one operator element.
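
A replacement rule of the type A => B can be sketched in Python as
follows. The sketch is an assumption-laden illustration rather than
the PAS-II rule interpreter: variables are written "?x" and "?y"
instead of the class names and X's of the actual rules, semantic
elements are tuples, and the slash convention for adjacent buffer
lines is not modeled.

```python
def match(pattern, element, bindings):
    """Match one pattern element against one semantic element.

    Returns the extended bindings, or None if the match fails.
    Tokens beginning with "?" are variables (a notation assumed here).
    """
    if len(pattern) != len(element):
        return None
    new = dict(bindings)
    for p, e in zip(pattern, element):
        if isinstance(p, str) and p.startswith("?"):
            if new.setdefault(p, e) != e:
                return None
        elif p != e:
            return None
    return new


def apply_rule(lhs, rhs, elements):
    """Replace the first run of consecutive elements matching lhs with rhs."""
    n = len(lhs)
    for i in range(len(elements) - n + 1):
        bindings = {}
        for pattern, element in zip(lhs, elements[i:i + n]):
            bindings = match(pattern, element, bindings)
            if bindings is None:
                break
        else:
            produced = [tuple(bindings.get(t, t) for t in out) for out in rhs]
            return elements[:i] + produced + elements[i + n:]
    return elements


# (NEG) (EQ ?x ?y) => (NEQ ?x ?y)
lhs = [("NEG",), ("EQ", "?x", "?y")]
rhs = [("NEQ", "?x", "?y")]
integrated = apply_rule(lhs, rhs, [("NEG",), ("EQ", "D", 6)])
```

Applied to the elements parsed from "D is not equal to 6", the rule
leaves the single element (NEQ D 6).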

Group Processor. There are two run modes in the Group processor:
GRAPHIC1 and GRAPHIC2. GRAPHIC1 processing fills in the values of
variables in the semantic elements by comparing the element
containing variables with all the elements currently active in the
PBG, i.e., the current context. When a match is found the appropriate
values are filled in. Currently the UNKNOWNS rules are not accessible
to the user.
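
The variable-filling step can be sketched in Python. Since the
UNKNOWNS rules are not described here, this is only a guess at the
simplest behavior consistent with the text: any item written in angle
brackets is treated as a variable, class membership is not checked,
and the first matching element of the context wins.

```python
def is_variable(item):
    """Treat items written like <L> as variables (an assumption of this sketch)."""
    return isinstance(item, str) and item.startswith("<") and item.endswith(">")


def fill_variables(element, context):
    """Fill the variables of `element` from the first matching context element.

    `context` is the list of elements currently active in the PBG. If
    nothing matches, the element is returned unchanged.
    """
    for known in context:
        if len(known) != len(element):
            continue
        if all(is_variable(e) or e == k for e, k in zip(element, known)):
            return known
    return element


context = [("EQ", "T", 0), ("EQ", "D", 5)]
filled = fill_variables(("EQ", "<L>", 5), context)
```

With (EQ D 5) in the current context, the element (EQ <L> 5) is
resolved to (EQ D 5), as in the example given for group processing.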

Processing in GRAPHIC2 is a joint man-machine effort. The goal is to
hypothesize for each knowledge element its origin, i.e., the operator
and its inputs (and the operators that produced those inputs, etc.)
that produced that knowledge element as output. The system queries
the user asking for possible operators and inputs that could have
produced the element whose origin is being sought. From this
information the system constructs an origin tree, and hypothesizes
which path through the tree represents the actual origin of the
element. The path is picked on the basis of the agreement between the
hypothesized inputs and the actual context defined by the current
PBG. The ORIGIN rules, like the GROUPING and UNKNOWNS rules, are
currently not accessible.

PBG Processor. The PBG processor contains one run mode: GRAPHIC3. In
the GRAPHIC3 mode, processing consists of taking the operator groups
produced in GRAPHIC2 and incorporating them into the problem behavior
graph. The CONFLICT rules are used to determine whether or not any
knowledge elements in the operator groups conflict with knowledge
already in the PBG. If such a conflict occurs, the PBG rules are used
to restructure the PBG so the conflict is eliminated.


SPACE rule 8 in Figure 6 is an exception. It defines a set named <V>
containing two members, the class <LETTER> and the class <CARRY>.

Two parsers are available, a simple top down parser and a more
sophisticated parser written by M. Rychener.

At the current stage of development the GROUPING rules have not been
made accessible to the user.

This is the major place where we have not regained in PAS-II the
power for automatic processing available in PAS-I.
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Both the CONFLICT and PBG rules are ordered
production rules of the form S => A, i.e., in situation
S take action A (12, 13). A situation is defined by
a list of values of certain variables, called the
state vector, SV. The left side of each production
rule has the form (V1 V2 V3 ...), where Vn represents
a permissible value for the nth state vector
variable. The right side has the form (A1 A2 A3 ...),
where the A's represent actions to be taken. The
current values of the state vector variables are compared
with the left side of each production rule. The first
match, from top to bottom, determines the actions to
be taken (an asterisk is considered to match any value).
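The matching scheme just described can be sketched in a few lines of Python. This is only an illustration of first-match-wins evaluation with an asterisk wildcard; the function name first_match and the rule set below are hypothetical and are not taken from PAS-II.

```python
WILDCARD = "*"

def first_match(state_vector, rules):
    """Return the actions of the first rule whose left side matches.

    Rules are tried from top to bottom. A left-side entry matches when it
    equals the corresponding state vector value or is the wildcard.
    """
    for left, actions in rules:
        if len(left) == len(state_vector) and all(
            pattern == WILDCARD or pattern == value
            for pattern, value in zip(left, state_vector)
        ):
            return actions
    return None  # no rule applies

# Hypothetical two-variable rule set, for illustration only.
rules = [
    (("T", "X"), ("ACT-1",)),
    (("*", "Y"), ("ACT-2", "ACT-3")),
    (("*", "*"), ("DEFAULT",)),
]

print(first_match(("T", "X"), rules))
print(first_match(("F", "Y"), rules))
print(first_match(("F", "Z"), rules))
```

Because the rules are ordered, a catch-all rule of wildcards placed last acts as a default.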

Figure 6 shows CONFLICT and PBG rules for
cryptarithmetic. The CONFLICT rules determine
whether or not two given knowledge elements conflict.
The example CONFLICT state vector contains: (SAME 2),
which is true (T) if the second items of both
elements are identical and false (F) otherwise;
(ITEM 1 1), which returns as a value the first item
of the first element (the element in the PBG); and
(ITEM 1 2), which returns as a value the first item
of the second element (the element in the group).
Thus if the two elements being compared were (ODD R)
and (NEQ R 5), CONFLICT rule 3 would match the state
vector and the decision would be that no conflict
exists.
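As a rough check of this example, the following Python sketch builds the CONFLICT state vector for two elements and applies the three CONFLICT productions of Figure 6. The tuple representation of elements and the function names are assumptions made for illustration; rule numbers follow Figure 6, where line 1 is the state vector definition.

```python
def conflict_state_vector(pbg_element, group_element):
    """Build ((SAME 2) (ITEM 1 1) (ITEM 1 2)) for two knowledge elements."""
    same_2 = "T" if pbg_element[1] == group_element[1] else "F"
    return (same_2, pbg_element[0], group_element[0])

# (rule number in Figure 6, left side, action)
CONFLICT_RULES = [
    (2, ("F", "*", "*"), "NO-CON"),
    (3, ("*", "ODD", "NEQ"), "NO-CON"),
    (4, ("*", "*", "*"), "ASK-IF-CON"),
]

def decide(state_vector, rules):
    """Return (rule number, action) for the first matching rule."""
    for number, left, action in rules:
        if all(p == "*" or p == v for p, v in zip(left, state_vector)):
            return number, action
    raise ValueError("no rule matched")

sv = conflict_state_vector(("ODD", "R"), ("NEQ", "R", "5"))
print(sv, decide(sv, CONFLICT_RULES))
```

With these rules, any pair of elements about the same letter that is not explicitly listed falls through to ASK-IF-CON, so the user is asked to decide.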

The PBG rules determine the type of restructuring
that occurs once a conflict is detected. The
PBG state vector in Figure 6 has 2 variables: TYPE,
which has the value CON if restructuring is based on
conflict and SIM if it is based on similarity; and
(ITEM 1 2), which is defined above. The actions shown
in Figure 6 are BLOCKREJ, a type of restructuring
where blocks of adjacent nodes are abandoned, and
COPY, a specification that the group causing the
restructuring should remain in the active portion of
the PBG after restructuring. The state vectors for
CONFLICT and PBG may contain variables and actions
other than the ones shown in Figure 6. For a complete
description of these rules see the PAS-II reference
manual (16).
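The selection of restructuring actions can be sketched in the same way. The Python below encodes the three PBG productions of Figure 6 and returns the action list for a given (TYPE, (ITEM 1 2)) pair; the function name and the list representation of actions are illustrative assumptions, and the restructuring itself is not modeled.

```python
# Left sides and action lists as given in Figure 6 (PBG rules 2-4).
PBG_RULES = [
    (("CON", "NEQ"), ["BLOCKREJ"]),
    (("CON", "*"), ["BLOCKREJ", "COPY"]),
    (("*", "*"), ["BLOCKREJ"]),
]

def restructuring_actions(kind, group_item):
    """Pick actions for the state vector (TYPE, (ITEM 1 2)); first match wins."""
    state_vector = (kind, group_item)
    for left, actions in PBG_RULES:
        if all(p == "*" or p == v for p, v in zip(left, state_vector)):
            return actions
    raise ValueError("no rule matched")

# A conflict caused by a group element whose first item is ODD.
print(restructuring_actions("CON", "ODD"))
# A restructuring based on similarity.
print(restructuring_actions("SIM", "EQ"))
```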

Trace Processor. The Topic, Linguistic,
Semantic, Group and PBG processors comprise the major
portion of PAS-II. It is this portion which represents
a generalized version of PAS-I. The Trace
processor is a new extension to the system and has no
analogue in PAS-I. Some parts of it, like the MATCH
mode, are still under development. The Trace processor
enables the user to write a production system
model of the subject (6), and then compare the trace
obtained by running the production system model with
the PBG obtained by analyzing the protocol. The
details are described elsewhere (16).

4.  Example  of  Program  Operation 

To illustrate the use of PAS-II, we present a
listing of the actual user-machine interaction
involved in the on-line analysis of a short
cryptarithmetic protocol. The cryptarithmetic task is
given in Figure 5. Both the protocol and the
cryptarithmetic rules used for this example are shown in
Figure 6. The protocol is stored in the TEXT mode
and the cryptarithmetic rules in the eight rules modes
shown. These rules approximate the minimal set needed
to analyze the given protocol, and are for expository
purposes only.


The PBG rules are also used for restructuring when
similarities (identical nodes) are detected, as
discussed in an earlier paper on PAS-I (15).

The  annotated  listing  is  shown  below.  The  user 
input  is  in  lower  case  and  the  system  output  in  upper 
case.  The  system  prompts  the  user  by  typing  either 
an  asterisk  or  a  question  followed  by  a  question 
mark  (?). 


*text display
TEXT MODE

1. D IS 5 ; THEREFORE T IS 0 . ASSUME R EQUALS 4 . SINCE YOU
CARRY 1 , R IS ODD . ASSUME R IS 7 , NOT 5 .

*next go

TOPIC MODE
1. D IS 5 ;
THEREFORE T IS 0 .
ASSUME R EQUALS 4 .
SINCE YOU CARRY 1 ,
R IS ODD .
ASSUME R IS 7 ,
NOT 5 .

OK? yes

TOPIC MODE FINISHED
*next go

LINGUISTIC1 MODE

1. D IS 5 ;
2. THEREFORE T IS 0 .
3. ASSUME R EQUALS 4 .
4. SINCE YOU CARRY 1 ,
5. R IS ODD .
6. ASSUME R IS 7 ,
7. NOT 5 .

OK? yes
*next go


  D O N A L D        D = 5
+ G E R A L D
  -----------
  R O B E R T

The  above  expression  is  a  simple  arithmetic  sum  in 
disguise.  Each  letter  represents  a  digit,  that  is, 
0,  1,  2,  ...,  9.  Each  letter  is  a  distinct  digit. 
You  are  given  that  D  represents  the  digit  5;  thus, 
no  other  letter  may  be  5. 

What  digits  should  be  assigned  to  the  letters  such 
that  when  the  letters  are  replaced  by  their  corres¬ 
ponding  digits  the  above  expression  is  a  true 
arithmetic  sum? 


Figure  5.  Cryptarithmetic  Task 


The  user  first  entered  the  TEXT  mode  and  dis¬ 
played  its  contents.  He  then  entered  the  next  mode 
in  the  control  cycle,  TOPIC,  and  started  processing 
by  typing  GO.  This  caused  the  SEGMENTATION  rules  to 
be  applied  to  the  data  in  TEXT.  The  system  indicated 
that  the  data  in  line  1  of  the  previous  mode  had  been 
transformed  into  the  seven  lines  shown  above,  and 
asked if this transformation was satisfactory (OK?).
At  this  point  the  user  typed  yes,  telling  the  system 
to  actually  put  those  seven  lines  into  the  next  seven 


At least four times as many rules would be needed
for a complete set (15).
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TEXT  MODE 

1. D IS 5 ; THEREFORE T IS 0 . ASSUME R EQUALS 4 . SINCE YOU
CARRY 1 , R IS ODD . ASSUME R IS 7 , NOT 5 .


SPACE  RULES 

1. (NEG) IND
2. (ODD <V>) KN
3. (EQ <V> <DIGIT>) KN
4. (THEREFORE) IND
5. (BECAUSE) IND
6. (ASSUME) IND
7. (DIGIT <DIGIT>) KN
8. (<V> <LETTER> <CARRY>) SPASET


GRAMMAR  RULES 

1. <EQ> = (<CARRYEQ>) (<LETTER> * <EQUAL> * <DIGIT>)
2. <CARRYEQ> = (<CARRY> * <DIGIT>) (<CARRY>)
3. <ODD> = (<LETTER> * <EQUAL> * ODD)
4. <EQUAL> = IS EQUAL EQUALS BE WAS ARE
5. <NEG> = CANNOT NOT NO N'T
6. <THEREFORE> = THEREFORE IMPLIES
7. <ASSUME> = ASSUME ASSUMING
8. <BECAUSE> = BECAUSE SINCE
9. <CARRY> = CARRY CARRYING CARRIED
10. <LETTER> = A B D E G L N O R T
11. <DIGIT> = 0 1 2 3 4 5 6 7 8 9


SEGMENTATION  RULES 

1. . /
2. ; /
3. <DIGIT> , /
4. <LETTER> , /


EXTRACTION  RULES 

1.  12 


INTEGRATION RULES

1. (X1 CARRY X2) => (X1 <C> X2)
2. (EQ X1 X2) / (DIGIT X3) => (EQ X1 X2) / (EQ X1 X3)
3. (NEG) (EQ <LETTER> <DIGIT>) => (NEQ <LETTER> <DIGIT>)
4. (ASSUME) (EQ <LETTER> <DIGIT>) => (AEQ <LETTER> <DIGIT>)

NORMALIZATION  RULES 

1. A1 / (THEREFORE) A2 => (BECAUSEOF A1 A2)
2. (BECAUSE) A1 / A2 => (BECAUSEOF A1 A2)

CONFLICT  RULES 

1. SV = ((SAME 2) (ITEM 1 1) (ITEM 1 2))
2. (F * *) => NO-CON
3. (* ODD NEQ) => NO-CON
4. (* * *) => ASK-IF-CON

PBG  RULES 

1. SV = (TYPE (ITEM 1 2))
2. (CON NEQ) => BLOCKREJ
3. (CON *) => (BLOCKREJ COPY)
4. (* *) => BLOCKREJ


Figure 6. Cryptarithmetic Rules.


lines of the TOPIC buffer. If the processing had
been unsatisfactory, the user could have jumped to
the SEGMENTATION mode, changed the rules, jumped
back to TOPIC, and reprocessed the data using the new
rules before proceeding with the next processing step.
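The effect of the four SEGMENTATION rules of Figure 6 on the protocol can be approximated with the Python sketch below. It assumes that "/" marks a segment boundary after the matched tokens and that tokens are separated by blanks, as in the listing; both are readings of the figure rather than documented PAS-II behavior, and the function name segment is hypothetical.

```python
LETTERS = set("ABDEGLNORT")   # <LETTER> class from the GRAMMAR rules
DIGITS = set("0123456789")    # <DIGIT> class from the GRAMMAR rules

def segment(text):
    """Split a protocol line into topic segments.

    A segment ends after "." or ";" (rules 1 and 2), and after a ","
    that directly follows a digit or a letter token (rules 3 and 4).
    """
    tokens = text.split()
    segments, current = [], []
    for i, token in enumerate(tokens):
        current.append(token)
        previous = tokens[i - 1] if i > 0 else ""
        boundary = token in {".", ";"} or (
            token == "," and (previous in DIGITS or previous in LETTERS)
        )
        if boundary:
            segments.append(" ".join(current))
            current = []
    if current:  # keep any trailing tokens that no rule closed
        segments.append(" ".join(current))
    return segments

protocol = ("D IS 5 ; THEREFORE T IS 0 . ASSUME R EQUALS 4 . SINCE YOU "
            "CARRY 1 , R IS ODD . ASSUME R IS 7 , NOT 5 .")
for line in segment(protocol):
    print(line)
```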

The user then entered the next mode, LINGUISTIC1,
and started processing. The EXTRACTION rules were
applied to the seven lines of data in TOPIC and the
system indicated that the processing should consist of
placing these lines in LINGUISTIC1 unchanged. Note
that the system indicated that line 1 from TOPIC was
transformed into a single line in LINGUISTIC1, etc.,
as opposed to the previous step where one line in TEXT
was transformed into seven lines in TOPIC.


LINGUISTIC2 MODE
<EQ> <LETTER> D
     <EQUAL> IS
     <DIGIT> 5

1. (EQ D 5)
FROM : D IS 5 ;

OK? yes batch suppress
BATCH=T
SUPPRESS=T

2. (EQ T 0) (THEREFORE)
FROM : THEREFORE T IS 0 .

3. (EQ R 4) (ASSUME)
FROM : ASSUME R EQUALS 4 .

4. (EQ CARRY 1) (BECAUSE)
FROM : SINCE YOU CARRY 1 ,

5. (ODD R)
FROM : R IS ODD .

6. (EQ R 7) (ASSUME)
FROM : ASSUME R IS 7 ,

7. (NEG) (DIGIT 5)
FROM : NOT 5 .

LINGUISTIC2 MODE FINISHED

*(batch f) (suppress f) automatic
BATCH=F
SUPPRESS=F
AUTOMATIC=T

*next go


Processing in LINGUISTIC2 consisted of applying
the SPACE and GRAMMAR rules to the data in LINGUISTIC1
to produce a parse. In step 1 the parse tree was
printed and the user set the flag BATCH true to
eliminate the OK? question (the system then assumes
the answer is always yes) and the flag SUPPRESS true
to eliminate further printing of the parse trees.
Then, before going to the next mode in the control
cycle, the user set the flag AUTOMATIC true so the
system would automatically step through the appropriate
run modes executing GO. At this point the LINGUISTIC2
buffer held the seven sets of semantic elements shown
above.


SEMANTIC1 MODE

RULES APPLIED : 4 1 2 4 3

1. (EQ D 5)
2. (EQ T 0) (THEREFORE)
3. (AEQ R 4)
4. (BECAUSE) (EQ <C> 1)
5. (ODD R)
6. (AEQ R 7)
7. (NEQ R 5)

OK? yes

SEMANTIC1 MODE FINISHED


SEMANTIC2 MODE
RULES APPLIED : 1 2

1-7. (BECAUSEOF ((EQ D 5)) ((EQ T 0)))
(AEQ R 4)
(BECAUSEOF ((EQ <C> 1)) ((ODD R)))
(AEQ R 7)
(NEQ R 5)

OK? yes

SEMANTIC2 MODE FINISHED
SEMANTIC3 MODE

1. (BECAUSEOF ((EQ D 5)) ((EQ T 0)))
2. (AEQ R 4)
3. (BECAUSEOF ((EQ <C> 1)) ((ODD R)))
4. (AEQ R 7)
5. (NEQ R 5)

OK? yes


Processing in SEMANTIC1 consisted of applying the
INTEGRATION rules to the semantic elements in
LINGUISTIC2. As indicated above there were five
applications of the rules. Processing in SEMANTIC2
consisted of applying the NORMALIZATION rules to the
seven sets of elements in SEMANTIC1. There were two
applications of the rules, and five sets of elements
were left in SEMANTIC2. Processing in SEMANTIC3
consisted of applying the GROUPING rules, which are not
explicit. These rules simply attempted to pull from
SEMANTIC2 one operator element and its associated
knowledge elements. Since no operator elements were
present, it pulled all the elements from SEMANTIC2.
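The reduction from seven element sets to five in SEMANTIC2 can be reproduced with a small Python sketch of the two NORMALIZATION rules in Figure 6. It assumes that "/" in a rule separates adjacent lines and that the indicator elements (THEREFORE) and (BECAUSE) may appear anywhere within a line; the data representation and the function names are hypothetical.

```python
THEREFORE = ("THEREFORE",)
BECAUSE = ("BECAUSE",)

def without(line, indicator):
    """Return the elements of a line with the indicator element removed."""
    return tuple(e for e in line if e != indicator)

def normalize(lines):
    """Apply NORMALIZATION rules 1 and 2 to a list of element sets."""
    out, i = [], 0
    while i < len(lines):
        cur = lines[i]
        nxt = lines[i + 1] if i + 1 < len(lines) else None
        if nxt is not None and THEREFORE in nxt:
            # Rule 1: A1 / (THEREFORE) A2 => (BECAUSEOF A1 A2)
            out.append([("BECAUSEOF", tuple(cur), without(nxt, THEREFORE))])
            i += 2
        elif nxt is not None and BECAUSE in cur:
            # Rule 2: (BECAUSE) A1 / A2 => (BECAUSEOF A1 A2)
            out.append([("BECAUSEOF", without(cur, BECAUSE), tuple(nxt))])
            i += 2
        else:
            out.append(cur)
            i += 1
    return out

semantic1 = [
    [("EQ", "D", "5")],
    [("EQ", "T", "0"), THEREFORE],
    [("AEQ", "R", "4")],
    [BECAUSE, ("EQ", "<C>", "1")],
    [("ODD", "R")],
    [("AEQ", "R", "7")],
    [("NEQ", "R", "5")],
]
for line in normalize(semantic1):
    print(line)
```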


GRAPHIC1 MODE

1. (BECAUSEOF ((EQ D 5)) ((EQ T 0)))
FROM : (BECAUSEOF ((EQ D 5)) ((EQ T 0)))

OK? yes

2. (AEQ R 4)
FROM : (AEQ R 4)

OK? yes

3. (BECAUSEOF ((EQ <C> 1)) ((ODD R)))
FROM : (BECAUSEOF ((EQ <C> 1)) ((ODD R)))
OK? yes batch suppress r: (becauseof ((eq c2 1)) ((odd r)))
BATCH=T
DO YOU REALLY WANT BOTH AUTOMATIC=T AND BATCH=T ? yes
SUPPRESS=T

4. (AEQ R 7)
FROM : (AEQ R 7)

5. (NEQ R 5)
FROM : (NEQ R 5)

GRAPHIC1 MODE FINISHED


Processing in GRAPHIC1 consisted of applying the
UNKNOWNS rules, which are not explicit. These rules
involve searching the existing PBG for elements that
match the elements containing unknowns. In this
simple example no matches were found because the PBG
had not yet been grown. Thus, in step 3 when the
unknown carry <C> was not found, the user told the
system to replace its processing result with
(BECAUSEOF ((EQ C2 1)) ((ODD R))). This was put
into line 3 of the GRAPHIC1 buffer, rather than the
result containing <C>. In effect the user told the
system that the value of <C> was C2, i.e., that the
unknown carry was the carry into the second column
(the L+L=R column).

Processing in GRAPHIC2 and GRAPHIC3 occurred as
follows: GRAPHIC2 was entered and the elements from
line 1 of GRAPHIC1 were processed interactively to
determine their operator groups. GRAPHIC3 was then
entered and these groups were grown as new nodes in the
PBG. Next GRAPHIC2 was reentered and the elements
from line 2 of GRAPHIC1 processed. This GRAPHIC2-
GRAPHIC3 loop was repeated for each line in GRAPHIC1.
Below is shown only one of these loops: processing
and growing the elements from line 3 of GRAPHIC1.


GRAPHIC2 MODE

FOR (BECAUSEOF ((EQ C2 1)) ((ODD R))) :
OP = (pc 2)
OUTPUTS = (odd r)
INPUTS = (eq c2 1)

FOR (EQ C2 1) :
OP = (av c2)
INPUTS =

OTHER ORIGINS FOR (EQ C2 1) ? yes
FOR (EQ C2 1) :
OP = (pc 1)
INPUTS = (eq d 5) (eq c1 0)

(EQ D 5) FOUND IN PBG
(EQ C1 0) FOUND IN PBG
OTHER ORIGINS FOR (EQ C2 1) ? no
ORIGIN TREE :

(ODD R) (PC 2) (EQ C2 1) (AV C2)
                         (PC 1) (EQ D 5)
                                (EQ C1 0)

3. (PC 1) ((EQ D 5) (EQ C1 0)) (EQ C2 1)
(PC 2) ((EQ C2 1)) (ODD R)

FROM : (BECAUSEOF ((EQ C2 1)) ((ODD R)))

GRAPHIC3 MODE

1. GROW (EQ C2 1)
FROM : (PC 1) ((EQ D 5) (EQ C1 0)) (EQ C2 1)

DO (AEQ R 4) AND (ODD R) CONFLICT ? yes

2. CONFLICT: N4 (AEQ R 4) AND (ODD R) WITH (BLOCKREJ COPY)
FROM : (PC 2) ((EQ C2 1)) (ODD R)

GRAPHIC3 MODE FINISHED


In GRAPHIC2 the system queried the user to determine
possible origins (operators and their inputs) for
the elements in question. This information was
represented as an origin tree as shown above. This
tree is displayed in Figure 7 in a more conventional style.


Each candidate operator is rated by (3 x used-inputs) -
(unused-inputs), where an input is "used" if it occurs
in the PBG. Thus (AV C2) has a rating of 0 while (PC 1)
has a rating of (3x2)-0 or 6. The format of the operator
groups produced in GRAPHIC2 is: operator (input
list) output.
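The two ratings quoted above can be checked with a short Python sketch of the decision function, 3 times the number of used inputs minus the number of unused inputs. The tuple encoding of elements and operators is an assumption made for illustration; the PBG contents are the two elements reported as FOUND IN PBG in the listing.

```python
def rating(inputs, pbg):
    """Rate an operator: (3 x used-inputs) - (unused-inputs)."""
    used = sum(1 for element in inputs if element in pbg)
    unused = len(inputs) - used
    return 3 * used - unused

# Elements already present in the PBG at this point of the example.
pbg = {("EQ", "D", "5"), ("EQ", "C1", "0")}

# Candidate origins for (EQ C2 1), each with its list of inputs.
candidates = {
    ("AV", "C2"): [],
    ("PC", "1"): [("EQ", "D", "5"), ("EQ", "C1", "0")],
}

for operator, inputs in candidates.items():
    print(operator, rating(inputs, pbg))

# The origin with the highest rating is chosen.
best = max(candidates, key=lambda op: rating(candidates[op], pbg))
print("chosen:", best)
```

The weight of 3 on used inputs means that one unused input does not rule out an operator whose other inputs are already in the PBG.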


In GRAPHIC3 the two groups from GRAPHIC2 were
incorporated into the PBG. The second group, with
(ODD R) as the output, conflicted with an existing
group in the PBG and led to restructuring of the PBG
to resolve the conflict. Conflicts were defined by
the CONFLICT rules, the type of restructuring by
the PBG rules.


*graphic3 display

GRAPHIC3 MODE

N1    OP (RECALL D)   OUT (EQ D 5)
N2    OP (RECALL C1)  OUT (EQ C1 0)
N3    OP (PC 1)  IN (EQ D 5) (EQ C1 0)  OUT (EQ T 0)
N4    OP (AV R)  OUT (AEQ R 4)
N5    OP (PC 1)  IN (EQ D 5) (EQ C1 0)  OUT (EQ C2 1)
N6    OP (PC 2)  IN (EQ C2 1)  OUT (ODD R)
N7 3  OP (PC 1)  IN (EQ D 5) (EQ C1 0)  OUT (EQ C2 1)
N8    OP (PC 2)  IN (EQ C2 1)  OUT (ODD R)
N9    OP (AV R)  OUT (AEQ R 7)
N10   OP (TD R 5)  IN (EQ D 5)  OUT (NEQ R 5)


After all the data from GRAPHIC1 was processed
in GRAPHIC2 and GRAPHIC3 the contents of GRAPHIC3
were displayed. Each line in the display represents
a node in the PBG. Node 10 contains the operator:
test to see if R can have the digit 5 as a value,
(TD R 5). Figure 8 shows this PBG in the conventional
representation. Note that the conflict between
(AEQ R 4) and (ODD R) led to a back-up that abandoned
nodes 4, 5 and 6. Thus the currently active nodes,
the ones that define the current context, are those
joined by the heavy lines in Figure 8.
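The set of active nodes can be recovered from parent links, as the Python sketch below shows. The parent table is a hypothetical reading of the display: node 7 is assumed to hang from node 3 after the back-up, and every other node is assumed to follow its predecessor. Under that assumption the abandoned nodes come out as 4, 5 and 6, in agreement with the text.

```python
# Parent of each PBG node; None marks the root. Assumed, not from PAS-II.
parents = {1: None, 2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 3, 8: 7, 9: 8, 10: 9}

def active_path(parents, last):
    """Follow parent links from the newest node back to the root."""
    path = []
    node = last
    while node is not None:
        path.append(node)
        node = parents[node]
    return list(reversed(path))

active = active_path(parents, 10)
abandoned = sorted(set(parents) - set(active))
print("active:", active)
print("abandoned:", abandoned)
```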


output:                   (ODD R)
operators:                (PC 2)
input / output:           (EQ C2 1)
operators:           (AV C2)     (PC 1)
input:                      (EQ D 5)  (EQ C1 0)


Figure  7.  Origin  Tree 


5.  Discussion 


The initial program, PAS-I, is an artificial
intelligence program by any reasonable criterion. The
task it attempts, the inference from verbal behavior
to Problem Behavior Graph, is a task requiring
intelligence when done by humans. The mechanisms used are
those common to other artificial intelligence
programs that tackle somewhat similar tasks: grammars
to deal with the surface structure of natural language,
representation of knowledge, matching, and heuristic
search to infer information not directly expressed in
the utterances.


The system analyzes the tree and decides which path
represents the best origin for the top element, in
this case (ODD R). Here there are only two
alternatives: the path with the operator: assign a value
to the carry into column 2, (AV C2), and the path
with the operator: process column 1, (PC 1). The
system chooses the latter, based on implicit ORIGIN
rules which tell it to choose between operators by
rating them according to their inputs. The decision
function currently in use is:


PAS-II is a program that accomplishes the same
task as PAS-I. Hence, it too is an artificial
intelligence program. But when looked at structurally it
more closely resembles a data processing framework
or, possibly, a language. Something has happened in
going from PAS-I to PAS-II, something worth
identifying and discussing.

Let us start with Planner (3) and QA4 (8).
These systems are languages for writing programs to
perform a class of artificial intelligence tasks. The

Choose to maximize: (3 x used-inputs) - (unused-inputs)


Conflict and PBG rules are described in detail in
an earlier paper (15).




There  are  other  representatives  of  this  class, 
e.g.,  POPLER  (1)  and  Conniver  (10,  11). 


Space limitations prevent us from including the
entire listing.


exact boundaries of these tasks are obscure but their
central core is clear and includes a large fraction of
the tasks for which heuristic programs have been built:
theorem proving, robot planning, symbolic
manipulation, etc. These systems were formed, essentially,
by taking a list processing framework and embedding
within it some of the ad hoc mechanisms developed
for particular heuristic programs. They include
backtracking, a generalized matching facility, a global
data base (accessed by pattern matching) and
multiprocessing control. Embedding these mechanisms
within a language makes possible their use in novel
combinations (and in interaction with the other
mechanisms available in higher languages).

This same embedding of mechanisms into a language
system has occurred in the transition from PAS-I to
PAS-II. PAS-II provides a framework within which a
class of AI programs can be easily constructed. This
class is not the same as that of the Planner/QA4
type system, which is more "mainline" artificial
intelligence. Rather, it appears to be characterized
as linguistic data processing, the essential feature
being the processing of long sequences of data
(rather than just a sentence at a time). This class
includes, of course, protocol analysis. It also
includes a number of other tasks: content analysis
of more classical varieties (9), problem space
construction (2), test grading, and what is coming to be
called semantic filtering.

The embodiment of mechanisms into a language
framework has occurred at two levels in PAS-II, one
corresponding roughly to that of Planner/QA4 and the
other more specialized. The first level is
represented by the PAS-II framework of run modes, rule
modes, common command language, editing system, and
control structure. This includes a set of
mechanisms for the data base (the run modes), a matching
facility (the common mechanism for how the rules work
on data), and a backtrack facility (the saving of
buffers so that processing can be undone). Added to
this is the explicit control structure for processing
within a stage and passing through the stages, which
corresponds to a weak method (4) in the same sense
as GPS's basic methods or the basic methods built into
the goal construct in Planner/QA4. These provide a
schema of operation which, though almost content free,
is still a rational procedure for achieving the
overall goal. The mechanisms adopted in PAS-II are
somewhat more shaped than their correspondents in
Planner/QA4, e.g., there is not a single global data
base or one stratified by a general context mechanism;
rather the data is organized into homogeneous groups
(the modes) along structural lines.

The second level is the specialization of the
various modes to specific subtasks inherent in tasks
of the class: segmentation, parsing, normalization,
etc. The specialized rule systems contain the
knowledge about the processing. Thus writing any sort of
legal rules within a given rule system generates
processing of the right sort (though it may not do the
right task). In this respect providing a single
generalized rule system or scheme for pattern matching and
pattern evoked actions (in the manner of Planner/QA4)
would move more of the knowledge required back across
the boundary from the language system (PAS-II) to the
coding within the system (the user program in PAS-II,
which is the set of actual rules in the rule modes).

As one moves PAS-II in the direction of a
generalized system for a wider class of problems, one
can expect the collection of rule modes to increase,
becoming eventually a library in the classic
subroutine library sense. The system designer is then
faced with the problem of providing these modes with
the rules needed to define processing in the various
problem domains. However, one advantage of
specialized rule systems is that when their structure
is highly constrained it becomes easy to predict the
effect of modifying rules in the system (as compared
to predicting the effect of modifying statements in
a general programming language). This sets the stage
for the development of self-modifying systems which
rewrite their own rules or, in effect, learn to
improve their performance in some data processing
task (12, 13). Such a capability in an interactive
PAS-II-like system would enable the system to build
or modify its own rules for a particular problem
domain, using feedback from the user to direct the
search for good sets of rules.

The evolution from PAS-I to PAS-II, in analogy
to the more general evolution going on toward
Planner-like language systems, should add to the
awareness that embedding mechanisms in language
remains a potent scheme for making advances in
artificial intelligence.
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